

FIGURE 6.1
An illustration of BiRe-ID based on Kernel Refining Generative Adversarial Learning (KR-GAL) and Feature Refining Generative Adversarial Learning (FR-GAL). KR-GAL consists of the unbinarized kernel w_i, the corresponding binarized kernel b_{w_i}, and the attention-aware scale factor α_i, which is employed to channel-wise reconstruct the binarized kernel b_{w_i}. We employ a conventional MSE loss and a GAN to fully refine w_i and α_i. FR-GAL is a self-supervision tool that refines the features of the low-level layers with the semantic information contained in the high-level features. To compare the features of the low- and high-level parts, we employ a 1×1 convolution and nearest-neighbor interpolation f(·) to keep the channel dimensions identical. Then the high-level features can be utilized to refine the low-level features through a GAN.

6.2 BiRe-ID: Binary Neural Network for Efficient Person Re-ID

This section proposes a new BNN-based framework for efficient person Re-ID (BiRe-ID) [262]. We introduce kernel and feature refinement based on generative adversarial learning (GAL) [76] to improve the representation capacity of BNNs. Specifically, we exploit GAL to efficiently refine the kernels and features of BNNs. We introduce an attention-aware factor to refine the 1-bit convolution kernel under the GAL framework (KR-GAL). We reconstruct real-valued kernels from their corresponding binarized counterparts and the attention-aware factor. This reconstruction process is supervised by both GAL and an MSE loss, as shown in the upper left corner of Fig. 6.1.
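The kernel reconstruction described above can be sketched in a few lines. This is a minimal NumPy illustration, not the BiRe-ID implementation: here the scale factor α_i is taken as the channel-wise mean of |w_i| (the closed-form minimizer of the MSE term), whereas BiRe-ID learns α_i under GAL supervision, and all tensor shapes are hypothetical.

```python
import numpy as np

def binarize_kernel(w):
    """Binarize a real-valued kernel with the sign function."""
    b_w = np.sign(w)
    b_w[b_w == 0] = 1.0  # map zeros to +1 so every entry lies in {-1, +1}
    return b_w

def channelwise_scale(w):
    """Per-output-channel scale factor alpha: the mean absolute value,
    which minimizes the MSE below in closed form. (In BiRe-ID, alpha
    is instead an attention-aware factor learned under GAL.)"""
    # w has shape (out_channels, in_channels, kH, kW)
    return np.abs(w).mean(axis=(1, 2, 3), keepdims=True)

def reconstruction_mse(w, alpha, b_w):
    """MSE between the real kernel and its scaled binary reconstruction."""
    return float(np.mean((w - alpha * b_w) ** 2))

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4, 3, 3))        # real-valued kernel w_i (hypothetical shape)
b_w = binarize_kernel(w)                 # binarized kernel b_{w_i}
alpha = channelwise_scale(w)             # channel-wise scale factor alpha_i
loss = reconstruction_mse(w, alpha, b_w) # supervises the reconstruction
print(b_w.shape, alpha.shape)
```

Channel-wise scaling matters because a single binary kernel loses all magnitude information; multiplying each output channel by its own α_i restores the first-order statistics of w_i at negligible extra cost.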

Furthermore, we employ a self-supervision framework to refine the low-level features under the supervision of the high-level features, which carry semantic information. As shown in the upper right corner of Fig. 6.1, we use a feature-refining generative adversarial network (FR-GAL) to supervise the low-level feature maps. In this way, the low-level features are refined by the semantic information contained in the high-level features, which improves the training process and leads to a sufficiently trained BNN.
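The alignment step that makes this comparison possible can be sketched as follows. This is a minimal NumPy sketch under assumed shapes: nearest-neighbor interpolation f(·) matches the spatial sizes, and a 1×1 convolution (a per-pixel linear map over channels) matches the channel counts; the GAN discriminator is replaced by a plain MSE here purely for brevity, and the projection weight `proj` is a hypothetical random initialization.

```python
import numpy as np

def nearest_upsample(x, factor):
    """Nearest-neighbor interpolation f(.) along the spatial axes.
    x has shape (C, H, W)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def conv1x1(x, weight):
    """1x1 convolution: an independent linear map over channels at each pixel.
    x: (C_in, H, W), weight: (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', weight, x)

rng = np.random.default_rng(1)
high = rng.normal(size=(32, 7, 7))    # high-level feature: more channels, smaller map
low = rng.normal(size=(16, 14, 14))   # low-level feature to be refined
proj = rng.normal(size=(16, 32)) / np.sqrt(32)  # 1x1 conv weight (hypothetical)

# Align the high-level feature to the low-level feature's shape,
# then compare; a discriminator would consume this pair in FR-GAL.
target = conv1x1(nearest_upsample(high, 2), proj)  # shape now matches `low`
mse = float(np.mean((low - target) ** 2))  # stand-in for the adversarial signal
print(target.shape)
```

Once the two feature maps share a shape, any pairwise loss (adversarial or MSE) can push the low-level features toward the semantics of the high-level ones.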

6.2.1 Problem Formulation

We first consider a general quantization problem for deeply accelerating convolution operations to calculate the quantized or discrete weights. We design a quantization process by